Distilling dialogues - A method using natural dialogue corpora for dialogue systems development
Authors
Abstract
We report on a method for utilising corpora collected in natural settings. It is based on distilling (re-writing) natural dialogues to elicit the type of dialogue that would occur if one of the dialogue participants were a computer instead of a human. The method is a complement to other means such as Wizard of Oz studies and un-distilled natural dialogues. We present the distilling method and guidelines for distillation. We also illustrate how the method affects a corpus of dialogues and discuss the pros and cons of three approaches in different phases of dialogue systems development.
Similar resources
Using the Process of Distilling Dialogues to Understand Dialogue Systems
Distilled dialogues, i.e. re-written natural dialogues, are a useful complement to dialogues collected in Wizard of Oz-experiments or in natural settings for development of dialogue systems. However, the distillation process itself also provides insights on human-computer interaction and on properties of dialogue systems. In this paper we present the distillation process, including how the guid...
The Negochat Corpus of Human-agent Negotiation Dialogues
Annotated in-domain corpora are crucial to the successful development of dialogue systems of automated agents, and in particular for developing natural language understanding (NLU) components of such systems. Unfortunately, such important resources are scarce. In this work, we introduce an annotated natural language human-agent dialogue corpus in the negotiation domain. The corpus was collected...
Automatic annotation of context and speech acts for dialogue corpora
Richly annotated dialogue corpora are essential for new research directions in statistical learning approaches to dialogue management, context-sensitive interpretation, and context-sensitive speech recognition. In particular, large dialogue corpora annotated with contextual information and speech acts are urgently required. We explore how existing dialogue corpora (usually consisting of utteranc...
Correlations between dialogue acts and learning in spoken tutoring dialogues
We examine correlations between dialogue behaviors and learning in tutoring, using two corpora of spoken tutoring dialogues: a human-human corpus and a human-computer corpus. To formalize the notion of dialogue behavior, we manually annotate our data using a tagset of student and tutor dialogue acts relative to the tutoring domain. A unigram analysis of our annotated data shows that student lea...
Automatic analysis of real dialogues and generating of training corpora
The development of computerized information retrieval dialogue systems communicating with the user in natural language requires the implementation of an effective training procedure with the aid of which the main modules of the dialogue system can be partly automatically developed. The presented paper describes an attempt to create the generating sentence templates automatically, using a specia...